[Others] Add NUM_MAX_DISPATCH_TOKENS_PER_RANK env to control #7188
RichardWooSJTU wants to merge 1 commit into PaddlePaddle:develop from
Thanks for your contribution!
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #7188 +/- ##
==========================================
Coverage ? 73.24%
==========================================
Files ? 376
Lines ? 52946
Branches ? 8263
==========================================
Hits ? 38779
Misses ? 11453
Partials ? 2714
fastdeploy-bot left a comment
🤖 AI Code Review | 2026-04-09
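📋 Review Summary
PR overview: Adds the NUM_MAX_DISPATCH_TOKENS_PER_RANK environment variable, which controls the maximum number of dispatch tokens per rank in MoE computation.
Scope of changes: fastdeploy/envs.py, fastdeploy/config.py
Impact tag: [Others]
📝 PR Convention Check
Title: Uses the [Others] tag, which follows the convention.
Description: The Motivation and Modifications sections contain only template text with no actual content, which does not follow the convention.
Description template (please fill in):
## Motivation
[Explain why this environment variable is needed, e.g., it allows MoE dispatch parameters to be tuned without modifying the model config, which simplifies performance tuning...]
## Modifications
[Describe the changes in detail, e.g., 1. add the NUM_MAX_DISPATCH_TOKENS_PER_RANK environment variable in envs.py; 2. add a consistency check in config.py...]
Issues
| Level | File | Summary |
|---|---|---|
| 🟡 Suggestion | fastdeploy/envs.py:271 | Environment variable name does not follow the project convention |
| 🟡 Suggestion | fastdeploy/envs.py:272 | Missing validation of the environment variable value |
| 🟡 Suggestion | fastdeploy/config.py:270 | Missing unit tests for the new logic |
Overall Assessment
The code logic is clear and implements a consistency check between the environment variable and the model config. However, the naming convention, input validation, and test coverage need improvement.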
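The consistency check described in this summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `resolve_num_max_dispatch_tokens`, the dict-based model config, and the injectable `environ` argument are all assumptions made for the example. The default of 128 is taken from the review comments, and non-positive values are treated as "unset", in line with the reviewer's validation suggestion.

```python
import os

ENV_NAME = "NUM_MAX_DISPATCH_TOKENS_PER_RANK"
CONFIG_KEY = "num_max_dispatch_tokens_per_rank"
DEFAULT_VALUE = 128  # default mentioned in the review; assumed here


def resolve_num_max_dispatch_tokens(model_config, environ=None):
    """Hypothetical helper: combine the env var and the model config value.

    - env unset (or non-positive): use the config value, else the default
    - env set, config unset: use the env value
    - both set and different: raise ValueError
    """
    environ = os.environ if environ is None else environ
    env_value = int(environ.get(ENV_NAME, "0"))
    config_value = model_config.get(CONFIG_KEY)

    if env_value <= 0:
        # Treat 0 or negative as "not set" (assumption, per the review suggestion).
        return DEFAULT_VALUE if config_value is None else config_value

    if config_value is not None and config_value != env_value:
        raise ValueError(
            f"{ENV_NAME}={env_value} conflicts with "
            f"{CONFIG_KEY}={config_value} in the model config"
        )
    return env_value
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the real process environment, e.g. `resolve_num_max_dispatch_tokens({}, {"NUM_MAX_DISPATCH_TOKENS_PER_RANK": "256"})`.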
    ),
    # Number of max dispatch tokens per rank for MoE computation.
    # If set, it must match the value in model config if present, otherwise an error will be raised.
    "NUM_MAX_DISPATCH_TOKENS_PER_RANK": lambda: (
🟡 Suggestion: The environment variable name does not follow the project convention.
Environment variables in this project consistently use the FD_ prefix (e.g., FD_MOE_BACKEND, FD_MOE_MXFP4_BACKEND). Consider renaming this variable to FD_NUM_MAX_DISPATCH_TOKENS_PER_RANK.
    # Number of max dispatch tokens per rank for MoE computation.
    # If set, it must match the value in model config if present, otherwise an error will be raised.
    "NUM_MAX_DISPATCH_TOKENS_PER_RANK": lambda: (
        int(os.getenv("NUM_MAX_DISPATCH_TOKENS_PER_RANK", "0"))
🟡 Suggestion: The environment variable value is not validated.
Setting the environment variable to 0 or a negative number may cause problems. Consider adding validation:
    "NUM_MAX_DISPATCH_TOKENS_PER_RANK": lambda: (
        int(value) if (value := os.getenv("NUM_MAX_DISPATCH_TOKENS_PER_RANK")) and int(value) > 0
        else None
    ),

    )
    else:
        # Use env value if model config doesn't explicitly set it
        self.num_max_dispatch_tokens_per_rank = env_num_max_dispatch
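🟡 Suggestion: The new logic is not covered by unit tests.
Consider adding test cases in tests/utils/test_config.py that verify the following scenarios:
- When the environment variable is not set, the default value 128 is used
- When the environment variable is set and differs from the model config, a ValueError is raised
- When the environment variable is set and the model config does not set the value, the environment variable value is used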
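The three scenarios listed above could be covered with tests along these lines. This is only a sketch: `resolve_dispatch_tokens` is a hypothetical stand-in for the logic in fastdeploy/config.py (the real tests would construct the project's own config object instead), and the test class and `run_suite` helper are names invented for the example. Only standard-library `unittest` and `unittest.mock.patch.dict` are used.

```python
import os
import unittest
from unittest import mock

ENV_NAME = "NUM_MAX_DISPATCH_TOKENS_PER_RANK"
CONFIG_KEY = "num_max_dispatch_tokens_per_rank"


def resolve_dispatch_tokens(model_config):
    """Hypothetical stand-in for the logic under test; not FastDeploy's API."""
    env_value = int(os.getenv(ENV_NAME, "0"))
    config_value = model_config.get(CONFIG_KEY)
    if env_value <= 0:  # env var treated as unset
        return 128 if config_value is None else config_value
    if config_value is not None and config_value != env_value:
        raise ValueError(f"{ENV_NAME}={env_value} conflicts with config value {config_value}")
    return env_value


class DispatchTokensEnvTest(unittest.TestCase):
    def test_default_when_env_unset(self):
        # clear=True removes every variable for the duration of the block
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertEqual(resolve_dispatch_tokens({}), 128)

    def test_conflict_raises(self):
        with mock.patch.dict(os.environ, {ENV_NAME: "256"}):
            with self.assertRaises(ValueError):
                resolve_dispatch_tokens({CONFIG_KEY: 128})

    def test_env_used_when_config_unset(self):
        with mock.patch.dict(os.environ, {ENV_NAME: "256"}):
            self.assertEqual(resolve_dispatch_tokens({}), 256)


def run_suite():
    """Run the three scenarios and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DispatchTokensEnvTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

`mock.patch.dict` restores the original environment when each `with` block exits, so the tests do not leak settings into one another.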
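Motivation
The num_max_dispatch_tokens_per_rank parameter required by low-latency EP communication can currently be specified only through config.json in the model directory, which is inconvenient to change in scenarios such as production rollout.
Modifications
Add the NUM_MAX_DISPATCH_TOKENS_PER_RANK environment variable. It has the same priority as the setting in config.json; if the two values conflict, an error is raised.
Usage or Command
export NUM_MAX_DISPATCH_TOKENS_PER_RANK=256
Accuracy Tests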
Checklist
- [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- pre-commit before commit.
- release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.